
    Interpretable Architectures and Algorithms for Natural Language Processing

    Paper V is excluded from the dissertation with respect to copyright. This thesis has two parts. First, we introduce human-level interpretable models for NLP tasks using the Tsetlin Machine (TM). Second, we present an interpretable model using DNNs. The first part develops several TM architectures for various NLP tasks and studies their robustness. We use the TM to propose logic-based text classification. We start with basic Word Sense Disambiguation (WSD), where we employ the TM and design novel interpretation techniques based on the frequency of words in the clauses. We then tackle a new problem in NLP, aspect-based text classification, using novel feature engineering for the TM. Since the TM operates on Boolean features, it relies on a Bag-of-Words (BOW) representation, which makes it difficult to use pre-trained word embeddings such as GloVe, word2vec, and fastText. Hence, we design a GloVe-embedded TM that significantly enhances the model's performance. In addition, NLP models are sensitive to distribution bias arising from spurious correlations, so we employ the TM to design a text classifier that is robust against spurious correlations. The second part of the thesis presents an interpretable model using DNNs, where we design a simple solution for a complex position-dependent NLP task. Since the TM's interpretability comes at the cost of performance, we propose a DNN-based architecture that applies a masking scheme to LSTM/GRU models, making the attention mechanism easier for humans to interpret. Finally, we take advantage of both models and design an ensemble that integrates the TM's interpretable information into the DNN for better visualization of attention weights. The proposed models can be combined into a fully explainable NLP pipeline that supports trustworthy AI. Overall, our models show excellent results and interpretations on several open-source NLP datasets. Thus, we believe that by combining the novel interpretation of the TM, the masking technique in the neural network, and the integrated ensemble model, we can build a simple yet effective platform for explainable NLP applications wherever necessary.
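
    Since the thesis repeatedly relies on the TM's Boolean bag-of-words input, a minimal sketch of that feature construction is given below; the toy documents and the use of scikit-learn's CountVectorizer are illustrative assumptions, not the exact pipeline used in the thesis.

        import numpy as np
        from sklearn.feature_extraction.text import CountVectorizer

        # Toy documents standing in for an NLP dataset (illustrative only).
        docs = ["the battery life is great", "the screen is dull and the battery is poor"]

        # binary=True records word presence/absence, the Boolean form the TM consumes.
        vectorizer = CountVectorizer(binary=True)
        X_bool = vectorizer.fit_transform(docs).toarray().astype(np.uint32)

        print(vectorizer.get_feature_names_out())
        print(X_bool)  # one Boolean feature vector per document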

    Enhancing Attention’s Explanation Using Interpretable Tsetlin Machine

    Explainability is one of the key factors in Natural Language Processing (NLP), especially for legal documents, medical diagnosis, and clinical text. The attention mechanism has recently been a popular choice for providing such explainability by estimating the relative importance of input units. Recent research has revealed, however, that such mechanisms tend to highlight irrelevant input units in their explanations. This is because the language representation layer is initialized with pretrained word embeddings that are not context-dependent. Such a lack of context-dependent knowledge in the initial layer makes it difficult for the model to concentrate on the important aspects of the input. Usually this does not affect the performance of the model, but the explanations diverge from human understanding. Hence, in this paper, we propose an ensemble method that embeds logic-based information from the Tsetlin Machine into the initial representation layer of the neural network to enhance explainability. We obtain a global clause score for each word in the vocabulary and feed it into the neural network layer as context-dependent information. Our experiments show that the ensemble method enhances the explainability of the attention layer without sacrificing model performance, and even outperforms the baseline on some datasets.
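
    A minimal sketch of the general idea follows, assuming a precomputed per-word clause score; the function and variable names are hypothetical, and the normalization is only one plausible way to feed the score into the representation layer.

        import numpy as np

        def augment_embeddings(embeddings, vocab, clause_score):
            """Append a TM clause-score channel to a pretrained embedding matrix.

            embeddings:   (V, d) array of pretrained word vectors.
            vocab:        list of V words, aligned with the embedding rows.
            clause_score: dict mapping a word to its aggregate (global) clause score.
            """
            scores = np.array([clause_score.get(w, 0.0) for w in vocab], dtype=np.float32)
            scores = (scores - scores.mean()) / (scores.std() + 1e-8)  # normalize the extra channel
            return np.concatenate([embeddings, scores[:, None]], axis=1)

        # usage (hypothetical inputs): emb = augment_embeddings(glove_matrix, vocab, tm_scores)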

    A Lite Romanian BERT: ALR-BERT

    Large-scale pre-trained language representations and their promising performance in various downstream applications have become an area of interest in the field of natural language processing (NLP). There has been great interest in further increasing model size in order to surpass the best previously obtained results. However, at some point, increasing a model's parameters may lead to a saturation point because of the limited capacity of GPUs/TPUs. In addition, such models are mostly available in English or in shared multilingual structures. Hence, in this paper, we propose a lite BERT trained on a large corpus solely in the Romanian language, which we call "A Lite Romanian BERT (ALR-BERT)". Based on comprehensive empirical results, ALR-BERT produces models that scale far better than the original Romanian BERT. Alongside presenting the performance on downstream tasks, we detail the analysis of the training process and its parameters. We also intend to distribute our code and model as open source, together with the downstream tasks.
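
    As a rough illustration of what makes a "lite" BERT feasible under GPU/TPU memory limits, the sketch below builds an ALBERT-style masked-language model with Hugging Face transformers; the vocabulary size and layer dimensions are assumptions, not ALR-BERT's published configuration.

        from transformers import AlbertConfig, AlbertForMaskedLM

        # ALBERT shares parameters across layers and factorizes the embedding matrix,
        # which keeps the parameter count low even with 12 hidden layers.
        config = AlbertConfig(
            vocab_size=50000,        # assumed Romanian SentencePiece vocabulary size
            embedding_size=128,      # factorized embedding dimension
            hidden_size=768,
            num_hidden_layers=12,
            num_attention_heads=12,
        )
        model = AlbertForMaskedLM(config)
        print(f"{sum(p.numel() for p in model.parameters()) / 1e6:.1f}M parameters")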

    Growth trends and forecasting of fish production in Assam, India using ARIMA model

    Fish is an essential component of the diet of most of the populace in Assam, and fish farming has been one of the sources of livelihood in rural areas. Assam ranks first in fish production among the North-eastern states of India. However, fish production is not sufficient to meet demand despite the vast aquatic resources of the state. The present study was undertaken to determine the decadal growth of fish production in the state using the compound growth rate. The study also attempted modeling and forecasting of fish production in Assam using the Auto-Regressive Integrated Moving Average (ARIMA) methodology. Time-series data on fish production in Assam from 1980-81 to 2018-19 were obtained from the Directorate of Fisheries, Government of Assam. Data for the period 1980-81 to 2014-15 were used to build the ARIMA model, which was validated on the remaining data from 2015-16 to 2018-19. The most suitable model for the state's fish production was ARIMA (1,1,0), based on the model selection criteria. The actual fish production and the values forecast by the fitted model were in close agreement. The out-of-sample forecasts of fish production in the state for the subsequent years 2019-20 to 2022-23 showed an increasing trend from 336.97 to 358.21 thousand metric tonnes. Considering the vast aquatic resources in the state, the study calls for serious attention from policymakers, researchers, and development agencies to harness the potential of fisheries resources and make the North-east region as a whole, and Assam in particular, self-sufficient in fish production.
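
    A minimal sketch of fitting ARIMA(1,1,0) and producing an out-of-sample forecast with statsmodels follows; the CSV file name and column names are placeholders for the Directorate of Fisheries series, which is not reproduced here.

        import pandas as pd
        from statsmodels.tsa.arima.model import ARIMA

        # Hypothetical file holding annual production in thousand metric tonnes,
        # indexed by year (1980-81 to 2014-15 for training).
        production = pd.read_csv("assam_fish_production.csv", index_col="year")["production"]

        model = ARIMA(production, order=(1, 1, 0)).fit()  # best model per the selection criteria
        print(model.summary())

        # Forecast the next four years (cf. 2019-20 to 2022-23 in the study).
        print(model.forecast(steps=4))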

    Robust Interpretable Text Classification against Spurious Correlations Using AND-rules with Negation

    State-of-the-art natural language processing models have raised the bar for excellent performance on a variety of tasks in recent years. However, concerns are rising over their sensitivity to distribution biases that reside in the training and testing data. This issue strongly affects the performance of such models when they are exposed to out-of-distribution and counterfactual data. The root cause seems to be that many machine learning models are prone to learning shortcuts, modelling simple correlations rather than more fundamental and general relationships. As a result, such text classifiers tend to perform poorly when a human makes minor modifications to the data, which raises questions regarding their robustness. In this paper, we employ a rule-based architecture called the Tsetlin Machine (TM) that learns both simple and complex correlations by ANDing features and their negations. As such, it generates explainable AND-rules using negated and non-negated reasoning. Here, we explore how non-negated reasoning can be more prone to distribution biases than negated reasoning. We further leverage this finding by adapting the TM architecture to mainly perform negated reasoning using the specificity parameter s. As a result, the AND-rules become robust to spurious correlations and can also correctly predict counterfactual data. Our empirical investigation of the model's robustness uses the specificity parameter s to control the degree of negated reasoning. Experiments on publicly available Counterfactually-Augmented Data demonstrate that the negated clauses are robust to spurious correlations and outperform Naive Bayes, SVM, and Bi-LSTM by up to 20%, and ELMo by almost 6%, on counterfactual test data.
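
    A minimal sketch with the open-source pyTsetlinMachine package is shown below; the hyperparameter values are illustrative, and X_train/X_test are assumed to be binarized bag-of-words matrices (as in the Boolean BOW sketch earlier), not the paper's exact setup.

        from pyTsetlinMachine.tm import MultiClassTsetlinMachine

        # Positional arguments: number of clauses, threshold T, specificity s.
        # A larger s pushes clauses toward more specific patterns, i.e. more
        # negated reasoning, which is the knob the paper uses for robustness.
        tm = MultiClassTsetlinMachine(2000, 80, 15.0)
        tm.fit(X_train, y_train, epochs=50)

        accuracy = (tm.predict(X_test) == y_test).mean()
        print(f"counterfactual test accuracy: {accuracy:.3f}")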

    Indoor Space Classification Using Cascaded LSTM

    Indoor space classification is an important part of localization that helps in precise location extraction and has been extensively utilized in industrial and domestic domains. Various approaches employ Bluetooth Low Energy (BLE), Wi-Fi, the magnetic field, object detection, and Ultra-Wideband (UWB) for indoor space classification. Many of the existing approaches require extensive pre-installed infrastructure, which makes obtaining reasonable accuracy costly. Therefore, improvements are still required to increase accuracy with minimal infrastructure. In this paper, we propose an approach that classifies indoor space using the geomagnetic field (GMF) and radio signal strength (RSS) as identifiers. The indoor space is a large open test bed divided into different indiscernible subspaces. We collect GMF and RSS at each subspace and classify them using a cascaded Long Short-Term Memory (LSTM) network. The experimental results show that the accuracy improves significantly when GMF and RSS are combined into distinct features. In addition, we compare the performance of the proposed model with state-of-the-art machine learning methods.
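
    A minimal Keras sketch of a cascaded (stacked) LSTM over concatenated GMF and RSS sequences follows; the window length, feature dimensions, number of subspaces, and layer sizes are assumptions, not the paper's exact architecture.

        import tensorflow as tf

        n_steps, n_gmf, n_rss, n_subspaces = 20, 3, 5, 8  # assumed dimensions

        # GMF and RSS readings concatenated into one feature vector per time step.
        inputs = tf.keras.Input(shape=(n_steps, n_gmf + n_rss))
        x = tf.keras.layers.LSTM(64, return_sequences=True)(inputs)  # first stage of the cascade
        x = tf.keras.layers.LSTM(32)(x)                              # second stage
        outputs = tf.keras.layers.Dense(n_subspaces, activation="softmax")(x)

        model = tf.keras.Model(inputs, outputs)
        model.compile(optimizer="adam", loss="sparse_categorical_crossentropy", metrics=["accuracy"])
        model.summary()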

    Cyclostationary Random Number Sequences for the Tsetlin Machine

    The Tsetlin Machine (TM) is an emerging machine learning algorithm that has shown competitive performance on several benchmarks. The underlying concept of the TM is propositional logic determined by a group of finite-state machines that learn patterns. Thus, TM-based systems naturally lend themselves to low-power operation when implemented in hardware for micro-edge Internet-of-Things applications. An important aspect of the TM's learning phase is stochasticity. For low-power integrated circuit implementations, the random number generation must be carried out efficiently. In this paper, we explore the application of pre-generated cyclostationary random number sequences for TMs. Through experiments on two machine learning problems, Binary Iris and Noisy XOR, we demonstrate that the accuracy is on par with the standard TM. We show through exploratory simulations that a sequence length meeting the conflicting trade-offs can be suitably identified. Furthermore, the TM achieves robust performance even with reduced resolution of the random numbers. Finally, we show that maximum-length sequences implemented by linear feedback shift registers are suitable for generating the required random numbers.
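
    The sketch below shows the kind of maximum-length sequence a linear feedback shift register produces; the 16-bit Fibonacci LFSR with taps (16, 14, 13, 11) is a standard maximal-length example, while the bit width the paper's hardware would actually use is not specified here.

        def lfsr16(seed=0xACE1):
            """Yield states of a 16-bit maximal-length Fibonacci LFSR (period 2**16 - 1)."""
            state = seed & 0xFFFF
            while True:
                # XOR the tap bits (taps 16, 14, 13, 11) to form the feedback bit.
                bit = ((state >> 0) ^ (state >> 2) ^ (state >> 3) ^ (state >> 5)) & 1
                state = (state >> 1) | (bit << 15)
                yield state

        gen = lfsr16()
        samples = [next(gen) for _ in range(5)]
        print(samples)  # deterministic, cyclostationary pseudo-random values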

    US News and Social Media Framing around Vaping

    In this paper, we investigate how vaping was framed differently (2008-2021) in US news and on social media. We analyze 15,711 news articles and 1,231,379 Facebook posts about vaping to study the differences in framing between media varieties. We use word embeddings to provide two-dimensional visualizations of the semantic changes around vaping for news and for social media. We show that news media framing of vaping shifted over time in line with emergent regulatory trends, such as flavored vaping bans, with little discussion of vaping as a smoking cessation tool. We found that social media discussions were far more varied, with transitions toward vaping both as a public health harm and as a smoking cessation tool. Our cloze test, dynamic topic model, and question answering showed similar patterns, where social media, but not news media, characterizes vaping as a combustible cigarette substitute. We use n-grams to show that social media data first centered on vaping as a smoking cessation tool and in 2019 moved toward narratives around vaping regulation, similar to news media frames. Overall, social media tracks the evolution of vaping as a social practice, while news media reflects more risk-based concerns. A strength of our work is how the different techniques we apply validate each other. Stakeholders may utilize our findings to intervene in the framing of vaping, and may design communication campaigns that improve the way society sees vaping, thus possibly aiding smoking cessation and reducing youth vaping.
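
    A minimal sketch of the embedding-based part of such an analysis follows: train a separate word2vec model per corpus slice and compare the nearest neighbours of "vaping". The corpus variables are placeholders; the actual news and Facebook data are not included here.

        from gensim.models import Word2Vec

        def neighbours(tokenized_sentences, term="vaping", topn=10):
            """Train word2vec on one corpus slice and return the term's nearest neighbours."""
            model = Word2Vec(tokenized_sentences, vector_size=100, window=5, min_count=5, workers=4)
            return model.wv.most_similar(term, topn=topn)

        # usage (hypothetical corpora): compare framing across media varieties or years
        # print(neighbours(news_sentences))
        # print(neighbours(facebook_sentences))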